1 Linear Regression

Linear regression is a means to test the relationship between one or more independent variables (i.e., manipulated variables) and one dependent variable (response variable). Depending on the nature of the independent variable(s), you may be fitting a model called a linear regression, ANOVA, or ANCOVA. We will discuss these differences and similarities. These models also have specific assumptions that must be met for the estimates of the relationship to be unbiased and useful for making predictions and interpreting the results.

1.1 One-Way ANOVA

The ANOVA model is a class of linear models where the response variable is quantitative (continuous) and the predictors are one or more nominal (categorical) variables. The term ANOVA comes from Analysis Of Variance, a method for partitioning the variance of a group of observations that was invented by Ronald Fisher over a century ago.

ANOVA is further categorized by the number of nominal variables included in the analysis. A one-way ANOVA has a single predictor. The following example shows the important sequential steps that should be used to complete a one-way ANOVA.

1.1.1 Review the Experimental Design

By reviewing the data and its characteristics you can confirm the appropriate analysis and understand the context of the hypothesis. This will also be helpful when considering how to visualize the data to help interpret the results.

This example is a pot experiment comparing the weed control efficacy of two herbicides used alone and in mixture. An untreated control was also included as a reference. The four treatments are 1) Metribuzin, 2) Rimsulfuron, 3) Metribuzin+Rimsulfuron, and 4) Untreated Control.

Sixteen uniform pots were prepared and sown with Solanum nigrum. When the plants reached the 4-true-leaves stage they were randomly sprayed with one of the above herbicide solutions following a completely randomized design (covered in more detail during the Experimental Design lecture) with four replicates. Three weeks after treatment, the plants in each pot were harvested and weighed. The lower the weight, the higher the efficacy of the herbicide. Our hypothesis was that there was a difference in the response (i.e., weight) among the treatments. The treatment that produces the lowest weight provides the best control of weed growth.
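The random assignment step can be reproduced in R with sample(). This is only a sketch with hypothetical pot numbers and an arbitrary seed, not the randomization actually used for the experiment:

```r
set.seed(2024)  # arbitrary seed so the example is reproducible
treatments <- c("Metribuzin", "Rimsulfuron", "Metribuzin+Rimsulfuron", "Untreated")
layout <- data.frame(pot   = 1:16,
                     treat = sample(rep(treatments, each = 4)))  # CRD: shuffle 4 copies of each
table(layout$treat)  # every treatment lands on exactly 4 pots
```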

1.1.1.1 Importing the Data

When organizing data (commonly in an Excel file), make sure the dataset is in a “tidy” format, with one row per observation and one column per variable. This is known as the long format and is what most R functions expect. Other data formats can be useful for some visualizations, and the data can be transformed between formats when necessary.
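If the data arrive in a wide layout instead (one column per treatment), they can be reshaped to long format before analysis. A sketch with made-up numbers using base R's reshape (tidyr's pivot_longer does the same job):

```r
# Hypothetical wide-format data: one column per treatment, one row per replicate
wide <- data.frame(rep = 1:4,
                   Metribuzin  = c(15.2, 4.4, 10.3, 6.8),
                   Rimsulfuron = c(18.1, 12.6, 20.0, 16.7))
# Convert to the long ("tidy") format expected by lm() and friends
long <- reshape(wide, direction = "long",
                varying = c("Metribuzin", "Rimsulfuron"),
                v.names = "Weight", timevar = "Treat",
                times = c("Metribuzin", "Rimsulfuron"))
long  # one row per observation: rep, Treat, Weight (plus an id column)
```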

One_Way_Anova <- read_csv(here("data", "One_Way_ANOVA.csv"))
## Rows: 16 Columns: 2
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (1): Treat
## dbl (1): Weight
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

1.1.1.2 Data Description

The summary function applied to the data object produces summary statistics for each variable in the dataset. We can also use the tapply function to produce the mean and standard deviation for each treatment level; the same can be done with the dplyr function summarize. This gives a first glimpse at the treatment averages and at how the standard deviation varies between treatments. We observe differences in the sample means, but our goal is to determine whether the population means differ.
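Base R's aggregate can compute both statistics in one call. A sketch using the built-in chickwts dataset (chick weight by feed type, the same one-way layout as our data); the equivalent call for our data would use Weight ~ Treat:

```r
# Base-R alternative to tapply/summarize: group means and SDs in one call,
# demonstrated on the built-in chickwts dataset
descr <- aggregate(weight ~ feed, data = chickwts,
                   FUN = function(x) c(mean = mean(x), sd = sd(x)))
descr  # one row per feed group with its mean and standard deviation
```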

summary(One_Way_Anova)
##     Treat               Weight      
##  Length:16          Min.   : 1.950  
##  Class :character   1st Qu.: 6.635  
##  Mode  :character   Median :12.850  
##                     Mean   :14.484  
##                     3rd Qu.:21.560  
##                     Max.   :30.940
treatMeans <- tapply(One_Way_Anova$Weight, One_Way_Anova$Treat, mean)
SDs <- tapply(One_Way_Anova$Weight, One_Way_Anova$Treat, sd)
descrit <- data.frame(treatMeans, SDs)
descrit
##                 treatMeans      SDs
## Metribuzin__348     9.1750 4.699089
## Mixture_378         5.1275 2.288557
## Rimsulfuron_30     16.8600 4.902353
## Unweeded           26.7725 3.168674

1.1.1.3 Fitting ANOVA Models with R

We fit a linear model with Weight as the response variable and Treatment as the independent factor. You can either declare the variable as a factor within the lm call, or convert the variable from a character to a factor beforehand. To check the variable types, use the str function or look at the data object in the Environment tab.

str(One_Way_Anova)
## spc_tbl_ [16 × 2] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
##  $ Treat : chr [1:16] "Metribuzin__348" "Metribuzin__348" "Metribuzin__348" "Metribuzin__348" ...
##  $ Weight: num [1:16] 15.2 4.38 10.32 6.8 6.14 ...
##  - attr(*, "spec")=
##   .. cols(
##   ..   Treat = col_character(),
##   ..   Weight = col_double()
##   .. )
##  - attr(*, "problems")=<externalptr>
mod <- lm(Weight ~ factor(Treat), data = One_Way_Anova)

One_Way_Anova$Treat <- factor(One_Way_Anova$Treat)
str(One_Way_Anova)
## spc_tbl_ [16 × 2] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
##  $ Treat : Factor w/ 4 levels "Metribuzin__348",..: 1 1 1 1 2 2 2 2 3 3 ...
##  $ Weight: num [1:16] 15.2 4.38 10.32 6.8 6.14 ...
##  - attr(*, "spec")=
##   .. cols(
##   ..   Treat = col_character(),
##   ..   Weight = col_double()
##   .. )
##  - attr(*, "problems")=<externalptr>
mod <- lm(Weight ~ Treat, data = One_Way_Anova)

Once the linear model is fit, you can view the parameter estimates using the summary function. The output shows the call (the terms included in the model), summary statistics of the residuals, a table of coefficients, and some other statistical results. This summary output is less useful for ANOVA, but we will revisit it for linear regression.

summary(mod)
## 
## Call:
## lm(formula = Weight ~ Treat, data = One_Way_Anova)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -6.360 -2.469  0.380  2.567  6.025 
## 
## Coefficients:
##                     Estimate Std. Error t value Pr(>|t|)    
## (Intercept)            9.175      1.959   4.684 0.000529 ***
## TreatMixture_378      -4.047      2.770  -1.461 0.169679    
## TreatRimsulfuron_30    7.685      2.770   2.774 0.016832 *  
## TreatUnweeded         17.598      2.770   6.352 3.65e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.918 on 12 degrees of freedom
## Multiple R-squared:  0.8554, Adjusted R-squared:  0.8193 
## F-statistic: 23.66 on 3 and 12 DF,  p-value: 2.509e-05

After fitting a model, the values you are most interested in reviewing are the fitted values and the residuals, which can be extracted using the fitted() and residuals() functions. The fitted values are the same as the means calculated above. The residuals are the deviations of each observation from its fitted mean, and will be used in later steps.

expected <- fitted(mod) #Not too useful but included here to show how to extract these values
epsilon <- residuals(mod)
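As a sanity check, each observation decomposes exactly into its fitted value plus its residual. A minimal sketch using the built-in chickwts dataset (a one-way layout like ours, so the example runs without our CSV):

```r
# Each observation equals its fitted value plus its residual
m <- lm(weight ~ feed, data = chickwts)
isTRUE(all.equal(unname(fitted(m) + residuals(m)), chickwts$weight))  # TRUE
```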

The ANOVA table is commonly obtained using one of two functions, from either the stats or the car package. Some of the more complicated model types we use later may work better with one than the other. We can provide some guidance, but other situations may require troubleshooting, so it is helpful to remember both options.

stats::anova(mod)
##           Df    Sum Sq   Mean Sq  F value   Pr(>F)
## Treat      3 1089.5287 363.17624 23.66259 2.51e-05
## Residuals 12  184.1774  15.34812       NA       NA
car::Anova(mod)
##              Sum Sq Df  F value   Pr(>F)
## Treat     1089.5287  3 23.66259 2.51e-05
## Residuals  184.1774 12       NA       NA

This is the type of output that is more often reported for ANOVA models. The interpretation of the F statistic and the p value follows commonly adopted thresholds, where a p value < 0.05 is considered significant. The norm is to report the numerator and denominator degrees of freedom, the F value, and the p value (e.g., F(3,12) = 23.663, p value < 0.0001). This means the F statistic from this model, 23.663, is larger than the critical value from the F distribution with 3 numerator and 12 denominator degrees of freedom. This test only tells us that there is a significant difference among the levels of the categorical variable; it does not say which levels are significantly different.
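The p value and the 5% critical value quoted above can be recovered directly from the F distribution in base R, using the F statistic and degrees of freedom from the ANOVA table:

```r
# p value for F = 23.66259 on 3 and 12 degrees of freedom
pf(23.66259, df1 = 3, df2 = 12, lower.tail = FALSE)  # matches the p value in the table
# 5% critical value the F statistic is compared against (about 3.49)
qf(0.95, df1 = 3, df2 = 12)
```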

1.1.1.4 Assumptions

The results of the model are only valid when the assumptions for linear models are met. These are sometimes called the Gauss-Markov Assumptions or BLUE (Best Linear Unbiased Estimator). These assumptions will be the same for ANOVA, ANCOVA, and linear regression.

Some of the assumptions deal with how the data are collected and the experiment is designed: variation in the independent variables, random sampling, and linearity in parameters. The other assumptions are assessed after the model is fit: normally distributed errors and homoskedasticity. There are other assumptions that are more theoretical in nature and won't be discussed here.

The residuals of the linear model can be assessed either visually with a graph or with formal hypothesis tests. The graphical methods are powerful for detecting strong deviations from the basic assumptions, and the formal tests can be used when the graphs give less clear answers.

To assess the normality of the errors (residuals) we often use a QQ (quantile-quantile) plot, in which the standardized residuals are plotted against the corresponding percentiles of a normal distribution. This can be achieved either with the qqnorm and qqline functions applied to the residuals (which we called epsilon earlier), or with ggplot using rstandard(mod). Make sure to run the qqnorm and qqline lines together.

qqnorm(epsilon)
qqline(epsilon)

ggplot() +
  geom_qq(aes(sample=rstandard(mod))) +
  geom_abline(color="red") +
  theme_classic()

The goal is for the points to fall as close to the line as possible. Some deviation from the line is normal, but look out for strong evidence of a skewed distribution, where the tails drift far from the line.

To assess homoskedasticity we look at a plot of the residuals against the fitted values. If the assumptions of no systematic error and homogeneous errors are met, the residuals should be randomly scattered without any visible systematic pattern. We look for potential outliers, a funnel-like pattern suggesting heteroskedasticity, or a positive/negative trend in the residuals over the fitted values.

plot(residuals(mod) ~ fitted(mod))
abline(h = 0)

Alternatively you can use the autoplot function from the ggfortify package to look at a comprehensive output and review the assumptions.

autoplot(mod)

A formal test of the normality assumption is the Shapiro-Wilk test on the residuals of the model. The null hypothesis is that the residuals are normally distributed; a significant result (p value < 0.05) indicates the residuals are not normally distributed.

shapiro.test(epsilon)
## 
##  Shapiro-Wilk normality test
## 
## data:  epsilon
## W = 0.97615, p-value = 0.9257

A formal test of the homogeneity of variance assumption is Levene's test on the independent variables in the model. The null hypothesis is that the variances across all groups are equal; a significant result (p value < 0.05) indicates at least one group has a different variance. This test is useful when you have a categorical independent variable, whereas the residuals vs fitted plot is more useful for a model with only continuous variables.

car::leveneTest(Weight ~ Treat, One_Way_Anova, center = mean)
##       Df  F value    Pr(>F)
## group  3 1.356301 0.3028878
##       12       NA        NA
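Levene's test with center = mean is nothing more than a one-way ANOVA on the absolute deviations of each observation from its group mean. A hand-rolled sketch on the built-in chickwts data (so it runs without our CSV):

```r
# Levene's test by hand: ANOVA on absolute deviations from group means
abs_dev <- abs(chickwts$weight - ave(chickwts$weight, chickwts$feed))
anova(lm(abs_dev ~ feed, data = chickwts))  # same F and p as car::leveneTest(..., center = mean)
```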

1.1.1.5 Expected Marginal Means

The expected marginal means will equal the arithmetic means when the experiment is balanced (the same number of replicates in each group), but will differ when the experiment is unbalanced. We will use the emmeans() function to produce the expected marginal means (also known as least squares means).
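The balanced/unbalanced distinction can be seen with a toy example (made-up numbers): with equal group sizes the raw mean and the average of the group means coincide; once the groups are unbalanced they diverge, and the marginal mean is the latter.

```r
# Balanced: raw mean equals the average of group means
g <- c("A", "A", "B", "B"); y <- c(1, 3, 10, 12)
mean(y)                    # 6.5
mean(tapply(y, g, mean))   # 6.5

# Unbalanced: the raw mean is pulled toward the larger group
g2 <- c("A", "A", "A", "B"); y2 <- c(1, 3, 2, 10)
mean(y2)                   # 4  (weighted toward group A)
mean(tapply(y2, g2, mean)) # 6  (the marginal / least squares mean)
```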

mu_j <- mod %>%
  emmeans( ~ Treat)
mu_j
##  Treat           emmean   SE df lower.CL upper.CL
##  Metribuzin__348   9.18 1.96 12     4.91     13.4
##  Mixture_378       5.13 1.96 12     0.86      9.4
##  Rimsulfuron_30   16.86 1.96 12    12.59     21.1
##  Unweeded         26.77 1.96 12    22.50     31.0
## 
## Confidence level used: 0.95

1.1.1.6 Multiple Comparisons

Once there is evidence that the linear model meets the assumptions, further analysis can be completed to produce unbiased estimates. The rejection of the null hypothesis that all treatment groups are equal suggests that there is at least one difference. The process of determining which treatment groups differ significantly from one another is known as post-hoc multiple comparison testing.

Two general approaches for multiple comparisons tests include linear contrasts and pairwise comparisons.

Linear contrasts are linear combinations of means (model parameters) whose coefficients add up to zero. One example from the herbicide treatments would be to compare the three herbicide treatments to the control: 1/3 + 1/3 + 1/3 - 1 = 0. Each level of the treatment variable is given a constant that defines which levels are being compared; here the average of the three herbicide treatments is compared to the control. You can test multiple contrasts, but the constants within each contrast must sum to zero; a set of contrasts whose pairwise cross-products also sum to zero is known as orthogonal contrasts.

cont1 <- c(1/3, 1/3, 1/3, -1) #the average of the first three levels compared to the fourth
cont2 <- c(1/2, -1, 1/2, 0) #the average of the first and third level compared to the second
cont3 <- c(0, -1, 1, 0) #the second level compared to the third
cont4 <- c(1, -1, 0, 0) #the first level compared to the second
contrasts <- list(cont1, cont2, cont3, cont4)
contrast(mu_j, method = contrasts, adjust = "none") #emmeans from expected marginal means
##  contrast                                                      estimate   SE df
##  c(0.333333333333333, 0.333333333333333, 0.333333333333333, -1   -16.39 2.26 12
##  c(0.5, -1, 0.5, 0)                                                7.89 2.40 12
##  c(0, -1, 1, 0)                                                   11.73 2.77 12
##  c(1, -1, 0, 0)                                                    4.05 2.77 12
##  t.ratio p.value
##   -7.244  <.0001
##    3.289  0.0065
##    4.235  0.0012
##    1.461  0.1697
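The first contrast can be verified by hand from the treatment means and the residual mean square reported earlier:

```r
# Estimate of contrast 1: average of the three herbicides minus the control
means <- c(Metribuzin = 9.1750, Mixture = 5.1275,
           Rimsulfuron = 16.8600, Unweeded = 26.7725)
cont1 <- c(1/3, 1/3, 1/3, -1)
sum(cont1 * means)                  # -16.385, matching the emmeans estimate
# Standard error: sqrt(MSE * sum(c_j^2) / n_reps), with MSE from the ANOVA table
sqrt(15.34812 * sum(cont1^2) / 4)   # ~2.26, matching the reported SE
```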

Pairwise comparisons are more commonly used (sometimes overused) and follow two general approaches: 1) all pairwise comparisons, and 2) comparisons with a control (Dunnett's method). All pairwise comparisons yield a large number of contrasts, some of which may not be of primary interest given the hypothesis, so carefully consider these steps within the context of your data and hypotheses. Furthermore, the "dunnett" method defaults to comparing each level to the first level in alphabetical order, which may not be the control. You can call the levels of the treatment variable from the dataset object and find that the fourth level is the control.

contrast(mu_j, adjust = "none", method = "pairwise")
##  contrast                         estimate   SE df t.ratio p.value
##  Metribuzin__348 - Mixture_378        4.05 2.77 12   1.461  0.1697
##  Metribuzin__348 - Rimsulfuron_30    -7.68 2.77 12  -2.774  0.0168
##  Metribuzin__348 - Unweeded         -17.60 2.77 12  -6.352  <.0001
##  Mixture_378 - Rimsulfuron_30       -11.73 2.77 12  -4.235  0.0012
##  Mixture_378 - Unweeded             -21.64 2.77 12  -7.813  <.0001
##  Rimsulfuron_30 - Unweeded           -9.91 2.77 12  -3.578  0.0038
levels(One_Way_Anova$Treat)
## [1] "Metribuzin__348" "Mixture_378"     "Rimsulfuron_30"  "Unweeded"
contrast(mu_j, adjust = "none", method = "dunnett", ref = 4)
##  contrast                   estimate   SE df t.ratio p.value
##  Metribuzin__348 - Unweeded   -17.60 2.77 12  -6.352  <.0001
##  Mixture_378 - Unweeded       -21.64 2.77 12  -7.813  <.0001
##  Rimsulfuron_30 - Unweeded     -9.91 2.77 12  -3.578  0.0038

With a large number of contrasts it can be impractical to report the effect size of every comparison. The cld (compact letter display) function helps to visualize the differences between treatment levels. You can produce either lowercase letters (letters) or uppercase letters (LETTERS), and have the letters follow increasing or decreasing emmean values. Be mindful that the letter display says nothing about the direction or size of the differences, which is its chief criticism as a way to present experimental results. This criticism can be alleviated by showing the letters alongside the raw data.

mu_j <- mod %>%
  emmeans(specs = "Treat") %>%
  cld(adjust = "none", Letters = letters)
mu_j
##   Treat            emmean       SE df   lower.CL  upper.CL .group
## 2 Mixture_378      5.1275 1.958834 12  0.8595676  9.395432 a
## 1 Metribuzin__348  9.1750 1.958834 12  4.9070676 13.442932 a
## 3 Rimsulfuron_30  16.8600 1.958834 12 12.5920676 21.127932  b
## 4 Unweeded        26.7725 1.958834 12 22.5045676 31.040432   c
mu_j_reversed <- mod %>%
  emmeans(specs = "Treat") %>%
  cld(adjust = "none", Letters = letters, reversed = TRUE)
mu_j_reversed
##   Treat            emmean       SE df   lower.CL  upper.CL .group
## 2 Unweeded        26.7725 1.958834 12 22.5045676 31.040432 a
## 1 Rimsulfuron_30  16.8600 1.958834 12 12.5920676 21.127932  b
## 3 Metribuzin__348  9.1750 1.958834 12  4.9070676 13.442932   c
## 4 Mixture_378      5.1275 1.958834 12  0.8595676  9.395432   c

In experiments with a large number of contrasts or simultaneously tested hypotheses, the type I error rate increases. This is known as the multiplicity problem. The adjust="none" argument above produces the results of the multiple comparisons without any adjustment to the p values, which is known as Fisher's Least Significant Difference (LSD). This is the least conservative approach to multiple comparisons, since no attempt is made to correct for the increased type I error. This approach has been argued to be appropriate when the p value of the F-test is highly significant, but this is up for debate.
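For this balanced design, Fisher's LSD can be computed by hand from quantities in the one-way ANOVA table; any pair of treatment means that differs by more than this value is declared significant at the 5% level:

```r
mse <- 15.34812; df_res <- 12; n_rep <- 4   # from the one-way ANOVA table
se_diff <- sqrt(2 * mse / n_rep)            # SE of a pairwise difference, ~2.77
lsd <- qt(0.975, df_res) * se_diff          # ~6.04 g
lsd
```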

Other approaches make adjustments to the p values and the confidence intervals of the pairwise comparisons. The Sidak method is one approach that can be used with the following argument - adjust=“sidak”.

mu_j <- mod %>%
  emmeans(specs = "Treat") %>%
  cld(adjust = "sidak", Letters = letters)
mu_j
##   Treat            emmean       SE df   lower.CL upper.CL .group
## 2 Mixture_378      5.1275 1.958834 12 -0.6004451 10.85545 a
## 1 Metribuzin__348  9.1750 1.958834 12  3.4470549 14.90295 ab
## 3 Rimsulfuron_30  16.8600 1.958834 12 11.1320549 22.58795  b
## 4 Unweeded        26.7725 1.958834 12 21.0445549 32.50045   c

Another example is the Bonferroni correction, a simpler and more commonly used adjustment procedure.

mu_j <- mod %>%
  emmeans(specs = "Treat") %>%
  cld(adjust = "bonferroni", Letters = letters)
mu_j
##   Treat            emmean       SE df   lower.CL upper.CL .group
## 2 Mixture_378      5.1275 1.958834 12 -0.6206176 10.87562 a
## 1 Metribuzin__348  9.1750 1.958834 12  3.4268824 14.92312 ab
## 3 Rimsulfuron_30  16.8600 1.958834 12 11.1118824 22.60812  b
## 4 Unweeded        26.7725 1.958834 12 21.0243824 32.52062   c
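The Bonferroni adjustment is simple enough to do by hand: each raw p value is multiplied by the number of comparisons and capped at 1. Base R's p.adjust implements this; here we use the six unadjusted pairwise p values from above, with the two reported as <.0001 approximated as 0.0001 for illustration:

```r
# Raw pairwise p values from the unadjusted comparisons (two values are
# approximations, since emmeans only reports them as <.0001)
p_raw <- c(0.1697, 0.0168, 0.0001, 0.0012, 0.0001, 0.0038)
p.adjust(p_raw, method = "bonferroni")  # each value multiplied by 6, capped at 1
```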

The last multiplicity adjustment we will discuss is the one most commonly seen: Tukey's HSD. This is the default method, so we simply remove the adjust argument. All three are considered conservative adjustments. You can also see the individual comparisons by including the details=TRUE option.

mu_j<-mod %>%
  emmeans(specs = "Treat") %>%
  cld(Letters = letters, details = F)
mu_j
##   Treat            emmean       SE df   lower.CL  upper.CL .group
## 2 Mixture_378      5.1275 1.958834 12  0.8595676  9.395432 a
## 1 Metribuzin__348  9.1750 1.958834 12  4.9070676 13.442932 ab
## 3 Rimsulfuron_30  16.8600 1.958834 12 12.5920676 21.127932  b
## 4 Unweeded        26.7725 1.958834 12 22.5045676 31.040432   c

We can use the results of the multiple comparisons test to help answer our hypotheses.

1.1.1.7 Data Visualization

mu_j <- mu_j %>%
  as_tibble()

Plot_means_tukey <- ggplot() +
  geom_point(data = One_Way_Anova, aes(y = Weight, x = Treat), position = position_jitter(width = 0.1)) + #dots representing the raw data
  geom_boxplot(data = One_Way_Anova, aes(y = Weight, x = Treat), position = position_nudge(x = -0.25), width = 0.25, outlier.shape = NA) + #boxplot
  geom_point(data = mu_j, aes(y = emmean, x = Treat), position = position_nudge(x = 0.15), size = 2, color = "red") + # red dots representing the adjusted means
  geom_errorbar(data = mu_j, aes(ymin = lower.CL, ymax = upper.CL, x = Treat), position = position_nudge(x = 0.15), color = "red", width = 0.1) + # red error bars representing the confidence limits of the adjusted means
  geom_text(data=mu_j,aes(y=emmean,x=Treat,label=str_trim(.group)),position=position_nudge(x=0.25),color="black",angle=0)+ # red letters 
  labs(y="Weed Weight (g)",x="Treatment")+
  theme_classic()
Plot_means_tukey

All herbicide treatments are better than the control (i.e., Unweeded). Mixture_378 performs better than Rimsulfuron_30, while Metribuzin__348 is not significantly different from either of the other treatments.

2 Two-Way ANOVA

When the linear model contains two categorical independent variables, it is known as a two-way ANOVA. The following steps demonstrate how to perform a two-way ANOVA.

2.0.0.1 Review the Experimental Design

By reviewing the data and its characteristics you can confirm the appropriate analysis and understand the context of the hypothesis. This will also be helpful when considering how to visualize the data to help interpret the results.

These data show the body mass in grams among different penguin species on different islands and between sexes of penguins. There are other variables recorded that could be analyzed, but we will be focusing on the species, sex, and body mass (g) of the penguins.

2.0.0.2 Importing the Data

Remember the “tidy” format…

Two_Way_Anova <- read_csv(here("data", "Two_Way_ANOVA.csv"))
## Rows: 344 Columns: 8
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (3): species, island, sex
## dbl (5): bill_length_mm, bill_depth_mm, flipper_length_mm, body_mass_g, year
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Two_Way_Anova <- subset(Two_Way_Anova, !is.na(sex)) # removes NAs within the sex variable

2.0.0.3 Data Description

The full dataset contains eight variables for 333 penguins (after removing NAs). Check the type of each variable in the dataframe to confirm that the categorical variables are treated as factors; you can either look in the Environment tab or use the str() function.

summary(Two_Way_Anova)
##    species             island          bill_length_mm  bill_depth_mm  
##  Length:333         Length:333         Min.   :32.10   Min.   :13.10  
##  Class :character   Class :character   1st Qu.:39.50   1st Qu.:15.60  
##  Mode  :character   Mode  :character   Median :44.50   Median :17.30  
##                                        Mean   :43.99   Mean   :17.16  
##                                        3rd Qu.:48.60   3rd Qu.:18.70  
##                                        Max.   :59.60   Max.   :21.50  
##  flipper_length_mm  body_mass_g       sex                 year     
##  Min.   :172       Min.   :2700   Length:333         Min.   :2007  
##  1st Qu.:190       1st Qu.:3550   Class :character   1st Qu.:2007  
##  Median :197       Median :4050   Mode  :character   Median :2008  
##  Mean   :201       Mean   :4207                      Mean   :2008  
##  3rd Qu.:213       3rd Qu.:4775                      3rd Qu.:2009  
##  Max.   :231       Max.   :6300                      Max.   :2009
str(Two_Way_Anova)
## tibble [333 × 8] (S3: tbl_df/tbl/data.frame)
##  $ species          : chr [1:333] "Adelie" "Adelie" "Adelie" "Adelie" ...
##  $ island           : chr [1:333] "Torgersen" "Torgersen" "Torgersen" "Torgersen" ...
##  $ bill_length_mm   : num [1:333] 39.1 39.5 40.3 36.7 39.3 38.9 39.2 41.1 38.6 34.6 ...
##  $ bill_depth_mm    : num [1:333] 18.7 17.4 18 19.3 20.6 17.8 19.6 17.6 21.2 21.1 ...
##  $ flipper_length_mm: num [1:333] 181 186 195 193 190 181 195 182 191 198 ...
##  $ body_mass_g      : num [1:333] 3750 3800 3250 3450 3650 ...
##  $ sex              : chr [1:333] "male" "female" "female" "female" ...
##  $ year             : num [1:333] 2007 2007 2007 2007 2007 ...
Two_Way_Anova$species <- factor(Two_Way_Anova$species)
Two_Way_Anova$sex <- factor(Two_Way_Anova$sex)
str(Two_Way_Anova)
## tibble [333 × 8] (S3: tbl_df/tbl/data.frame)
##  $ species          : Factor w/ 3 levels "Adelie","Chinstrap",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ island           : chr [1:333] "Torgersen" "Torgersen" "Torgersen" "Torgersen" ...
##  $ bill_length_mm   : num [1:333] 39.1 39.5 40.3 36.7 39.3 38.9 39.2 41.1 38.6 34.6 ...
##  $ bill_depth_mm    : num [1:333] 18.7 17.4 18 19.3 20.6 17.8 19.6 17.6 21.2 21.1 ...
##  $ flipper_length_mm: num [1:333] 181 186 195 193 190 181 195 182 191 198 ...
##  $ body_mass_g      : num [1:333] 3750 3800 3250 3450 3650 ...
##  $ sex              : Factor w/ 2 levels "female","male": 2 1 1 1 2 1 2 1 2 2 ...
##  $ year             : num [1:333] 2007 2007 2007 2007 2007 ...

2.0.0.4 Fitting ANOVA Models with R

Fitting a linear model with body_mass_g as the response variable and the factors of species and sex as the independent categorical variables.

We are interested in determining the following: 1) whether body mass varies among the different species of penguins, 2) whether body mass varies between the sexes of penguins, and 3) whether the variation in body mass among species differs between females and males (the interaction).

Hypothesis tests are as follows:

Main effect of sex on body mass:
H0: mean body mass is equal between females and males
H1: mean body mass is different between females and males

Main effect of species on body mass:
H0: mean body mass is equal across all three species
H1: mean body mass is different for at least one species

Interaction between sex and species:
H0: there is no interaction between sex and species, meaning that the relationship between species and body mass is the same for females and males
H1: there is an interaction between sex and species, meaning that the relationship between species and body mass is different for females than for males

mod <- lm(body_mass_g ~ species * sex, data = Two_Way_Anova)

Once the linear model is fit, we view the parameter estimates using the summary function.

summary(mod)
## 
## Call:
## lm(formula = body_mass_g ~ species * sex, data = Two_Way_Anova)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -827.21 -213.97   11.03  206.51  861.03 
## 
## Coefficients:
##                          Estimate Std. Error t value Pr(>|t|)    
## (Intercept)               3368.84      36.21  93.030  < 2e-16 ***
## speciesChinstrap           158.37      64.24   2.465  0.01420 *  
## speciesGentoo             1310.91      54.42  24.088  < 2e-16 ***
## sexmale                    674.66      51.21  13.174  < 2e-16 ***
## speciesChinstrap:sexmale  -262.89      90.85  -2.894  0.00406 ** 
## speciesGentoo:sexmale      130.44      76.44   1.706  0.08886 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 309.4 on 327 degrees of freedom
## Multiple R-squared:  0.8546, Adjusted R-squared:  0.8524 
## F-statistic: 384.3 on 5 and 327 DF,  p-value: < 2.2e-16

And we extract the residuals.

epsilon <- residuals(mod)

In this instance we are testing both species and sex and their interaction, so the type III sums of squares are most appropriate. Type I sums of squares test each factor sequentially, and type II sums of squares are inappropriate when there is an interaction term.

This link can provide more details for your reference.
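The order dependence of type I (sequential) sums of squares only appears when the design is unbalanced. A toy simulation (made-up factors and effects, not the penguin data) illustrates it:

```r
# Unbalanced two-factor layout: cell counts 2, 1, 1, 4
set.seed(1)
d <- data.frame(A = factor(c("a1", "a1", "a1", "a2", "a2", "a2", "a2", "a2")),
                B = factor(c("b1", "b1", "b2", "b1", "b2", "b2", "b2", "b2")))
d$y <- rnorm(8) + 2 * (d$A == "a2") + 3 * (d$B == "b2")
# Type I SS for A depends on whether A enters the model first or second
anova(lm(y ~ A * B, data = d))["A", "Sum Sq"]  # A fitted first
anova(lm(y ~ B * A, data = d))["A", "Sum Sq"]  # A fitted after B: different value
```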

stats::anova(mod) #Presents type I sums of squares
##              Df    Sum Sq     Mean Sq    F value    Pr(>F)
## species       2 145190219 72595109.56 758.358072 0.0000000
## sex           1  37090262 37090261.78 387.459976 0.0000000
## species:sex   2   1676557   838278.37   8.756997 0.0001973
## Residuals   327  31302628    95726.69         NA        NA
car::Anova(mod) #Presents type II sums of squares
##                Sum Sq  Df    F value    Pr(>F)
## species     143401584   2 749.015666 0.0000000
## sex          37090262   1 387.459976 0.0000000
## species:sex   1676557   2   8.756997 0.0001973
## Residuals    31302628 327         NA        NA
car::Anova(mod, type = "III")
##                Sum Sq  Df     F value    Pr(>F)
## (Intercept) 828480899   1 8654.648789 0.0000000
## species      60350016   2  315.220419 0.0000000
## sex          16613442   1  173.550777 0.0000000
## species:sex   1676557   2    8.756997 0.0001973
## Residuals    31302628 327          NA        NA

Remember that this test only tells us that there is a significant difference among the levels of the categorical variable. It doesn’t say which levels are significantly different.

2.0.0.5 Assumptions

To assess the normality of errors (residuals) we use a QQ (quantile-quantile)-plot. The standardized residuals are plotted against the respective percentiles of a normal distribution.

qqnorm(epsilon)
qqline(epsilon)

To assess homoskedasticity we look at the plot of the residuals against the fitted values.

plot(residuals(mod) ~ fitted(mod))
abline(h = 0)

autoplot(mod)

Then we run the Shapiro-Wilk test on the residuals of the model.

shapiro.test(epsilon)
## 
##  Shapiro-Wilk normality test
## 
## data:  epsilon
## W = 0.99776, p-value = 0.9367

Finally, Levene's test on the independent variables in the model.

car::leveneTest(mod)
##        Df  F value    Pr(>F)
## group   5 1.390826 0.2272166
##       327       NA        NA

Results of the above figures and tests indicate that model assumptions have been met.

2.0.0.6 Expected Marginal Means

We can call for the emmeans for both main effects and the interaction.

mu_species <- mod %>%
  emmeans(~ species) %>%
  as_tibble() %>% #Converts the object to a tibble
  arrange(emmean) #Arranges the emmean values as ascending values
## NOTE: Results may be misleading due to involvement in interactions
mu_species
## species     emmean       SE  df lower.CL upper.CL
## Adelie    3706.164 25.60590 327 3655.791 3756.537
## Chinstrap 3733.088 37.51993 327 3659.277 3806.899
## Gentoo    5082.289 28.37142 327 5026.475 5138.102
mu_sex <- mod %>%
  emmeans(~ sex) %>%
  as_tibble() %>%
  arrange(emmean)
## NOTE: Results may be misleading due to involvement in interactions
mu_sex
## sex      emmean       SE  df lower.CL upper.CL
## female 3858.594 25.33613 327 3808.752 3908.437
## male   4489.100 25.15752 327 4439.609 4538.591
group_by(Two_Way_Anova, species, sex) %>%
  summarise(mean = round(mean(body_mass_g, na.rm = TRUE)),
            sd = round(sd(body_mass_g, na.rm = TRUE)))
## `summarise()` has grouped output by 'species'. You can override using the
## `.groups` argument.
species sex mean sd
Adelie female 3369 269
Adelie male 4043 347
Chinstrap female 3527 285
Chinstrap male 3939 362
Gentoo female 4680 282
Gentoo male 5485 313

2.0.0.7 Multiple Comparisons

Given the evidence that the linear model meets the assumptions, further analysis can produce unbiased estimates. Rejecting the null hypothesis that body mass is the same across species and sexes tells us there is a difference among the levels of one variable across the other variable, so we move on to the post-hoc multiple comparisons tests.

When you have a significant interaction effect you shouldn't interpret the main effects independently, since each one's interpretation depends on the other effect. With multiple factors and an interaction there are two options for multiple comparisons: test all pairwise combinations of the two factors by placing a colon between the two terms, or test pairwise comparisons of one effect within each level of the other effect by placing a pipe between the two terms. We will show both methods as an example.

all_means_tukey <- mod %>%
  emmeans(~ species : sex) %>%
  cld(Letters = letters)
all_means_tukey
species sex emmean SE df lower.CL upper.CL .group
1 Adelie female 3368.836 36.21222 327 3297.597 3440.074 a
2 Chinstrap female 3527.206 53.06120 327 3422.821 3631.590 a
5 Chinstrap male 3938.971 53.06120 327 3834.586 4043.355 b
4 Adelie male 4043.493 36.21222 327 3972.255 4114.731 b
3 Gentoo female 4679.741 40.62586 327 4599.820 4759.662 c
6 Gentoo male 5484.836 39.61427 327 5406.905 5562.767 d
within_means_tukey <- mod %>%
  emmeans(~ species | sex) %>%
  cld(Letters = letters)
within_means_tukey
species sex emmean SE df lower.CL upper.CL .group
1 Adelie female 3368.836 36.21222 327 3297.597 3440.074 a
2 Chinstrap female 3527.206 53.06120 327 3422.821 3631.590 b
3 Gentoo female 4679.741 40.62586 327 4599.820 4759.662 c
5 Chinstrap male 3938.971 53.06120 327 3834.586 4043.355 a
4 Adelie male 4043.493 36.21222 327 3972.255 4114.731 a
6 Gentoo male 5484.836 39.61427 327 5406.905 5562.767 b

2.0.0.8 Data Visualization

all_means_tukey <- all_means_tukey %>%
  as_tibble() #convert object to tibble so that ggplot can read the details

Plot_all_means_tukey <- ggplot() +
  geom_point(data = Two_Way_Anova, aes(y = body_mass_g, x = species, color = sex), position = position_dodgenudge(width = 0.9, x = 0)) + #dots representing the raw data (position_dodgenudge() is from the ggpp package)
  geom_boxplot(data = Two_Way_Anova, aes(y = body_mass_g, x = species, color = sex), width = 0.25, outlier.shape = NA, position = position_dodgenudge(width = 0.5, x = 0)) + # boxplot
  geom_point(data = all_means_tukey, aes(y = emmean, x = species, group = sex), position = position_dodgenudge(width = 0.5, x = 0), size = 2, color = "red") + # red dots representing the adjusted means
  geom_errorbar(data = all_means_tukey, aes(ymin = lower.CL, ymax = upper.CL, x = species, group = sex), position = position_dodgenudge(width = 0.5, x = 0),color = "red", width = 0.1) + # red error bars representing the confidence limits of the adjusted means
  geom_text(data = all_means_tukey, aes(y = emmean, x = species, group = sex, label = str_trim(.group)), color = "black", position = position_dodgenudge(width = 1.2, x = -0.025), hjust = 0, angle = 0) + # letters indicating the Tukey groupings
  labs(y = "Body Mass (g)", x = "Species") +
  theme_classic()
Plot_all_means_tukey
## Warning: `position_dodge()` requires non-overlapping x intervals.

within_means_tukey <- within_means_tukey %>%
  as_tibble()

Plot_within_means_tukey <- ggplot() +
  facet_wrap(~ sex, labeller = label_both) + #facet per sex level
  geom_point(data = Two_Way_Anova, aes(y = body_mass_g, x = species, color = species)) + # dots representing the raw data
  geom_boxplot(data = Two_Way_Anova, aes(y = body_mass_g, x = species, color = species), width = 0.25, outlier.shape = NA, position = position_nudge(x = 0.2)) + # boxplot
  geom_point(data = within_means_tukey, aes(y = emmean, x = species), color = "red", position = position_nudge(x = 0.2)) + # red dots representing the adjusted means
  geom_errorbar(data = within_means_tukey, aes(ymin = lower.CL, ymax = upper.CL, x = species), color = "red", width = 0.1, position = position_nudge(x = 0.2)) + # red error bars representing the confidence limits of the adjusted means
  geom_text(data = within_means_tukey, aes(y = emmean, x = species, label = str_trim(.group)), color = "black", position = position_nudge(x = 0.4), hjust = 0) + # letters indicating the Tukey groupings
  labs(y = "Body Mass (g)", x = "Species") +
  theme_classic() # clearer plot format 
Plot_within_means_tukey

3 Simple Linear Regression

When all of the independent variables in a linear model are continuous, the model is known as a linear regression. Simple linear regression has one independent variable, and multiple linear regression has two or more independent variables.
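As a minimal sketch (using simulated, hypothetical data and variable names), the two forms differ only in the model formula:

```r
# Simulated data (hypothetical) to contrast simple vs. multiple regression
set.seed(1)
d <- data.frame(x1 = runif(20), x2 = runif(20))
d$y <- 2 + 3 * d$x1 - 1 * d$x2 + rnorm(20, sd = 0.1)

simple   <- lm(y ~ x1, data = d)       # simple linear regression: one predictor
multiple <- lm(y ~ x1 + x2, data = d)  # multiple linear regression: two predictors
coef(multiple)                         # intercept plus one slope per predictor
```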

3.0.0.1 Review the Experimental Design

Data from Kniss et al. (2012) show the sugar yield of sugar beet in response to volunteer corn density. The response variable is sucrose production in pounds per acre (LbsSucA), and the independent variable is volunteer corn density in plants per foot of row.

3.0.0.2 Importing the Data

Simple_Linear_Regression <- read_csv(here("data", "Simple_Linear_Regression.csv"))
## Rows: 24 Columns: 3
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (1): Loc
## dbl (2): Density, LbsSucA
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

3.0.0.3 Fitting Linear Model with R

mod <- lm(LbsSucA ~ Density, data = Simple_Linear_Regression)

Once the linear model has been fit, you can view the parameter estimates using the summary() function.

summary(mod)
## 
## Call:
## lm(formula = LbsSucA ~ Density, data = Simple_Linear_Regression)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -2086.02 -1084.01    23.92   726.23  3007.73 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   8677.6      485.6  17.869 1.38e-14 ***
## Density     -20644.7     5645.2  -3.657  0.00139 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1383 on 22 degrees of freedom
## Multiple R-squared:  0.3781, Adjusted R-squared:  0.3498 
## F-statistic: 13.37 on 1 and 22 DF,  p-value: 0.001387

The resulting output provides estimates for the intercept (8677.6) and slope (-20644.7) of the regression line. The regression equation for this example is Y (pounds sucrose/acre) = 8677.6 - 20644.7 * X (density). The interpretation is that every one-unit increase in volunteer corn density (plants/ft row) results in a 20,644.7-unit decrease in yield. The maximum volunteer corn density used in the study was only 0.15 plants per foot of sugar beet row, so a one-unit change in density is beyond the scale of the data; we can still use the equation to calculate the estimated pounds of sucrose per acre for a fractional change in density.
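For example, evaluating the fitted equation at a density of 0.1 plants per foot of row (within the range of the data):

```r
# Hand check of the fitted equation at a density inside the observed range
intercept <- 8677.6   # estimates copied from summary(mod) above
slope     <- -20644.7
density   <- 0.1      # plants per foot of row (study maximum was 0.15)
intercept + slope * density
## [1] 6613.13
```

Equivalently, predict(mod, newdata = data.frame(Density = 0.1)) gives the same estimate, up to the rounding of the printed coefficients.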

The ANOVA table can be used but provides little information here. Notice that the results for the effect of density do not change among the methods - with a single effect and no interaction, the type I, II, and III sums of squares are identical.

stats::anova(mod) #Presents type I sums of squares
            Df   Sum Sq  Mean Sq  F value    Pr(>F)
Density      1 25572276 25572276 13.37385 0.0013869
Residuals   22 42066424  1912110       NA        NA
car::Anova(mod) #Presents type II sums of squares
            Sum Sq Df  F value    Pr(>F)
Density   25572276  1 13.37385 0.0013869
Residuals 42066424 22       NA        NA
car::Anova(mod, type = "III")
               Sum Sq Df   F value    Pr(>F)
(Intercept) 610547289  1 319.30549 0.0000000
Density      25572276  1  13.37385 0.0013869
Residuals    42066424 22        NA        NA

3.0.0.4 Assumptions

To assess the normality of errors (residuals) we again use a QQ (quantile-quantile) plot.

qqnorm(mod$residuals)
qqline(mod$residuals)

ggplot() +
  geom_qq(aes(sample=rstandard(mod))) +
  geom_abline(color = "red") +
  theme_classic()

To assess homoskedasticity we look at the plot of the residuals against the fitted values.

plot(residuals(mod) ~ fitted(mod))
abline(h = 0)

autoplot(mod)

Then we run the Shapiro-Wilk test on the residuals of the model.

shapiro.test(mod$residuals)
## 
##  Shapiro-Wilk normality test
## 
## data:  mod$residuals
## W = 0.96343, p-value = 0.511

There is no need for Levene's test since the model includes no categorical variable.

3.0.0.5 Data Visualization

This is a simple plot for the simple linear regression model above. There will be more complicated figures and syntax required with more complicated models.

par(mfrow = c(1, 1), mar = c(3.2, 3.2, 2, 0.5), mgp = c(2, 0.7, 0)) #Number of rows of figures, the margins of the figure, and margin line for the axis title labels and line
plot(Simple_Linear_Regression$LbsSucA ~ Simple_Linear_Regression$Density, bty = "l", #Y and X variables for the plot and the "L" box type for the plot area
     ylab = "Sugar yield (lbs/A)", xlab = "Volunteer corn density (plants/ft row)",
     main = "Lingle, WY 2009", ylim = c(0, 10000))
  abline(mod) # Add the regression line
# to add the regression equation into the plot:
  int <- round(summary(mod)$coefficients[[1]]) # get intercept
  sl <- round(summary(mod)$coefficients[[2]]) # get slope
  reg.eq <- paste("Y =", int, sl, "* X") # create text regression equation
  legend("bottomleft",reg.eq, bty = "n")

The results from this model focus on the estimate of the slope (density), which is negative. This indicates that as density increases, yield decreases.

4 ANCOVA

When a linear model has multiple independent variables of which at least one is categorical and at least one is continuous, the model is known as ANCOVA (analysis of covariance). The following steps demonstrate how to perform an ANCOVA.

4.0.0.1 Review the Experimental Design

By reviewing the data and its characteristics you can confirm the appropriate analysis and understand the context of the hypothesis. This will also be helpful when considering how to visualize the data to help interpret the results.

4.0.0.2 Importing the Data

Remember the “tidy” format…

ANCOVA <- read_csv(here("data", "ANCOVA.csv")) %>%
  mutate(loc = as.factor(loc)) # Convert loc to a factor
## Rows: 84 Columns: 3
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (1): loc
## dbl (2): density, yield
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

4.0.0.3 Data Description

The dataset contains the results from an onion trial that tested the effect of planting density (plants per square meter) on yield at two locations (Purnong Landing and Virginia).

summary(ANCOVA)
##     density           yield        loc   
##  Min.   : 18.78   Min.   : 28.96   P:42  
##  1st Qu.: 39.54   1st Qu.: 78.46   V:42  
##  Median : 61.78   Median :108.90         
##  Mean   : 73.33   Mean   :119.70         
##  3rd Qu.: 97.86   3rd Qu.:153.47         
##  Max.   :184.75   Max.   :272.15

4.0.0.4 Initial Visualization of the Data

ggplot(ANCOVA, aes(x = density, y = yield, color = loc)) +
  geom_point() +
  theme_classic()

4.0.0.5 Fitting the ANCOVA Model with R

Fit a linear model with yield as the response variable and density (continuous) and location (categorical) as the independent variables. Make sure to check the type of each variable in the dataframe to confirm the categorical variable is stored as a factor. You can either look in the Environment tab or use the str() function.

We are interested in determining the following:

  1. The variation in yield among the different locations
  2. The variation in yield along the gradient of density
  3. Whether the relationship between density and yield differs between the two locations (interaction)

Hypothesis tests are as follows:
Main effect of location on yield:
H0: mean yield is equal between locations
H1: mean yield is different between locations
Main effect of density on yield:
H0: slope of trend line between density and yield is zero
H1: slope of trend line between density and yield is not zero
Interaction between location and density:
H0: there is no interaction between location and density, meaning that the relationship between density and yield is the same for both locations
H1: there is an interaction between location and density, meaning that the relationship between density and yield is different between locations

str(ANCOVA)
## tibble [84 × 3] (S3: tbl_df/tbl/data.frame)
##  $ density: num [1:84] 23.5 26.2 27.8 32.9 33.3 ...
##  $ yield  : num [1:84] 223 234 222 222 197 ...
##  $ loc    : Factor w/ 2 levels "P","V": 1 1 1 1 1 1 1 1 1 1 ...
mod <- lm(yield ~ density * loc, data = ANCOVA)

Once the linear model has been fit, you can view the parameter estimates using the summary() function.

summary(mod)
## 
## Call:
## lm(formula = yield ~ density * loc, data = ANCOVA)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -39.054 -16.329  -7.433  14.720 110.485 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  219.65959    8.61694  25.492  < 2e-16 ***
## density       -1.13684    0.09641 -11.792  < 2e-16 ***
## locV         -37.97622   11.67184  -3.254  0.00167 ** 
## density:locV   0.07090    0.13904   0.510  0.61148    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 26.03 on 80 degrees of freedom
## Multiple R-squared:  0.768,  Adjusted R-squared:  0.7593 
## F-statistic: 88.27 on 3 and 80 DF,  p-value: < 2.2e-16

After fitting a model we extract the residuals.

epsilon <- residuals(mod)

The ANOVA table needs the type III test because the model includes an interaction effect.

car::Anova(mod, type = "III")
                 Sum Sq Df     F value    Pr(>F)
(Intercept) 440242.8033  1 649.8222443 0.0000000
density      94198.7191  1 139.0424161 0.0000000
loc           7172.0378  1  10.5863166 0.0016701
density:loc    176.1878  1   0.2600627 0.6114811
Residuals    54198.5513 80          NA        NA

The interaction effect isn’t significant, but each main effect is. Remember that this test only tells us that there is a significant difference among the levels of the categorical variable. It doesn’t say which levels are significantly different.

4.0.0.6 Assumptions

Assess the normality of errors (residuals) using a QQ (quantile-quantile)-plot.

qqnorm(mod$residuals)
qqline(mod$residuals)

ggplot() +
  geom_qq(aes(sample = rstandard(mod))) +
  geom_abline(color = "red") +
  theme_classic()

Some deviation from the line is normal, but look out for strong evidence of a skewed distribution where the tails are further from the line.

Then assess homoskedasticity with the plot of the residuals against the fitted values.

plot(residuals(mod) ~ fitted(mod))
abline(h=0)

autoplot(mod)

The formal hypothesis test of the normality assumption is the Shapiro-Wilk test on the residuals of the model. The null hypothesis is that the residuals are normally distributed; a significant result (p-value < 0.05) indicates the residuals are not normally distributed.

shapiro.test(mod$residuals)
## 
##  Shapiro-Wilk normality test
## 
## data:  mod$residuals
## W = 0.90293, p-value = 1.045e-05

When results suggest a failure to meet one of the linear model assumptions, different variance-stabilizing transformations are available to remove potential biases and produce better estimates from the linear model. Logarithmic transformations are often used for counts and proportions. The square-root transformation is also used for counts. The arcsine-square-root transformation is very common with proportions.

These transformations were necessary prior to the advancements in computing power that allow for generalized linear models (glm) and generalized least squares models (gls), which handle non-normal and heteroscedastic data. Other options we will discuss include nonparametric methods that make few or no assumptions about the distribution of residuals.
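As a quick illustration only (the made-up values below are hypothetical, and the arcsine-square-root transform is not applied to this dataset, since yield is not a proportion), the transformations mentioned above are one-line operations in base R:

```r
counts <- c(1, 4, 9, 16, 25)           # hypothetical count data
log(counts + 1)                        # log transform (the +1 guards against zeros)
sqrt(counts)                           # square-root transform
p <- c(0.05, 0.25, 0.50, 0.75, 0.95)   # hypothetical proportion data
asin(sqrt(p))                          # arcsine-square-root transform
```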

Given the results of the residuals plot and the qqplot we will try different transformations to meet the model assumptions.

mod_log <- lm(log(yield) ~ density * loc, data = ANCOVA)
summary(mod_log)
## 
## Call:
## lm(formula = log(yield) ~ density * loc, data = ANCOVA)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.28094 -0.08451 -0.01393  0.08597  0.46119 
## 
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   5.5469442  0.0453984 122.184  < 2e-16 ***
## density      -0.0096874  0.0005079 -19.072  < 2e-16 ***
## locV         -0.1868138  0.0614932  -3.038  0.00322 ** 
## density:locV -0.0017592  0.0007325  -2.402  0.01864 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.1371 on 80 degrees of freedom
## Multiple R-squared:  0.9163, Adjusted R-squared:  0.9132 
## F-statistic: 292.1 on 3 and 80 DF,  p-value: < 2.2e-16
autoplot(mod_log)

mod_sqrt <- lm(sqrt(yield) ~ density * loc, data = ANCOVA)
summary(mod_sqrt)
## 
## Call:
## lm(formula = sqrt(yield) ~ density * loc, data = ANCOVA)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1.6303 -0.5901 -0.1920  0.6584  3.6758 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  15.248086   0.314155  48.537  < 2e-16 ***
## density      -0.051729   0.003515 -14.717  < 2e-16 ***
## locV         -1.415515   0.425530  -3.326  0.00133 ** 
## density:locV -0.002126   0.005069  -0.419  0.67610    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.9489 on 80 degrees of freedom
## Multiple R-squared:  0.8507, Adjusted R-squared:  0.8451 
## F-statistic: 151.9 on 3 and 80 DF,  p-value: < 2.2e-16
autoplot(mod_sqrt)

Neither transformation improves the distribution of the residuals, so we will try fitting a model that includes a polynomial term to address the curvature apparent in the residual plot.

mod_2 <- lm(yield ~ poly(density, 2) * loc, data = ANCOVA)
summary(mod_2)
## 
## Call:
## lm(formula = yield ~ poly(density, 2) * loc, data = ANCOVA)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -46.719  -9.717   0.843   8.556  80.295 
## 
## Coefficients:
##                        Estimate Std. Error t value Pr(>|t|)    
## (Intercept)             138.096      2.842  48.598  < 2e-16 ***
## poly(density, 2)1      -452.487     25.680 -17.620  < 2e-16 ***
## poly(density, 2)2       164.686     25.767   6.391 1.11e-08 ***
## locV                    -35.824      4.014  -8.924 1.52e-13 ***
## poly(density, 2)1:locV   70.837     36.994   1.915   0.0592 .  
## poly(density, 2)2:locV   12.187     36.737   0.332   0.7410    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 18.15 on 78 degrees of freedom
## Multiple R-squared:   0.89,  Adjusted R-squared:  0.8829 
## F-statistic: 126.2 on 5 and 78 DF,  p-value: < 2.2e-16
autoplot(mod_2) #This greatly improves the residuals plot and qqplot

car::Anova(mod_2, type = "III")
                         Sum Sq Df     F value    Pr(>F)
(Intercept)           778268.66  1 2361.813187 0.0000000
poly(density, 2)      107659.44  2  163.357143 0.0000000
loc                    26242.18  1   79.637187 0.0000000
poly(density, 2):loc    1246.35  2    1.891151 0.1577523
Residuals              25702.69 78          NA        NA

We find the same result: the interaction effect isn't significant, and we can now move on to the multiple comparisons steps.

4.0.0.7 Estimated Marginal Means

We will use the emmeans() function to produce the estimated marginal means.

mu_location <- mod_2 %>%
  emmeans(~ loc) %>% # same as , at = list(density = mean(ANCOVA$density)) - conditional effect
  cld(Letters = letters) %>%
  as_tibble() %>% # Converts the object to a tibble
  arrange(emmean) # Arranges the emmean values as ascending values
## NOTE: Results may be misleading due to involvement in interactions
mu_location
loc emmean SE df lower.CL upper.CL .group
V 84.49209 3.992269 78 76.54409 92.44009 a
P 121.54122 3.649864 78 114.27490 128.80754 b

4.0.0.8 Multiple Comparisons

We are comparing the slopes of the two trend lines to see if they are significantly different. The ANOVA table indicates that they are not. We review the multiple comparisons results anyway to demonstrate how you would report significant differences in cases where the interaction effect is significant.

#emtrends to compare slopes between categorical variables
mu_density_linear <- mod_2 %>%
  emtrends(pairwise ~ loc, var = "density", at = list(density = mean(ANCOVA$density)), deriv = 1) %>% 
  cld(Letters = letters)
mu_density_linear
loc density.trend SE df lower.CL upper.CL .group
P -1.557676 0.0941092 78 -1.745033 -1.370319 a
V -1.397229 0.0853642 78 -1.567177 -1.227283 a
mu_density_quadratic <- mod_2 %>%
  emtrends(pairwise ~ loc, var = "density", deriv = 2) %>%
  cld(Letters = letters)
mu_density_quadratic
loc density.trend SE df lower.CL upper.CL .group
P -1.557676 0.0941092 78 -1.745033 -1.370319 a
V -1.397229 0.0853642 78 -1.567177 -1.227283 a

4.0.0.9 Data Visualization

We need to produce a trend line for each location. We will use the expand.grid() and predict() functions to produce out-of-sample predictions that form the trend lines. With a more complex model, the expand.grid() call would need to include the other terms in the model (not relevant in this example); you would hold those other variables constant at their mean or median while testing the relationship between density and yield at both locations.

summary(ANCOVA)
##     density           yield        loc   
##  Min.   : 18.78   Min.   : 28.96   P:42  
##  1st Qu.: 39.54   1st Qu.: 78.46   V:42  
##  Median : 61.78   Median :108.90         
##  Mean   : 73.33   Mean   :119.70         
##  3rd Qu.: 97.86   3rd Qu.:153.47         
##  Max.   :184.75   Max.   :272.15
new.x <- expand.grid(density = seq(18.78, 184.75, length.out=100),
                   loc = levels(ANCOVA$loc))
head(new.x)
density loc
18.78000 P
20.45646 P
22.13293 P
23.80939 P
25.48586 P
27.16232 P
new.y <- predict(mod_2, newdata = new.x, interval = 'confidence')

#Bring new.x and new.y together
addThese <- data.frame(new.x, new.y)
addThese <- rename(addThese, yield = fit)
head(addThese)
density loc yield lwr upr
18.78000 P 235.5113 220.3284 250.6943
20.45646 P 231.1478 216.6635 245.6321
22.13293 P 226.8389 213.0302 240.6477
23.80939 P 222.5847 209.4276 235.7417
25.48586 P 218.3850 205.8551 230.9148
27.16232 P 214.2399 202.3118 226.1680
ggplot(ANCOVA, aes(x = density, y = yield, color = loc)) +
  geom_point(size = 5) +
  geom_smooth(data = addThese, aes(ymin = lwr, ymax = upr, fill = loc), stat = 'identity') +
  scale_color_manual(values = c(P = "blue", V = "red")) +
  scale_fill_manual(values = c(P = "blue", V = "red")) +
  theme_classic()

We see parallel trend lines, suggesting no significant difference in the relationship between density and yield between the two locations.

5 Practice Example

We are going to revisit the penguin data for practice. We can import the data again.

Two_Way_Anova <- read_csv(here("data", "Two_Way_ANOVA.csv")) %>%
  mutate(island = as.factor(island),
         year = as.factor(year),
         species = as.factor(species))
## Rows: 344 Columns: 8
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (3): species, island, sex
## dbl (5): bill_length_mm, bill_depth_mm, flipper_length_mm, body_mass_g, year
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Two_Way_Anova <- subset(Two_Way_Anova, !is.na(sex)) #removes NAs within the sex variable
str(Two_Way_Anova)
## tibble [333 × 8] (S3: tbl_df/tbl/data.frame)
##  $ species          : Factor w/ 3 levels "Adelie","Chinstrap",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ island           : Factor w/ 3 levels "Biscoe","Dream",..: 3 3 3 3 3 3 3 3 3 3 ...
##  $ bill_length_mm   : num [1:333] 39.1 39.5 40.3 36.7 39.3 38.9 39.2 41.1 38.6 34.6 ...
##  $ bill_depth_mm    : num [1:333] 18.7 17.4 18 19.3 20.6 17.8 19.6 17.6 21.2 21.1 ...
##  $ flipper_length_mm: num [1:333] 181 186 195 193 190 181 195 182 191 198 ...
##  $ body_mass_g      : num [1:333] 3750 3800 3250 3450 3650 ...
##  $ sex              : chr [1:333] "male" "female" "female" "female" ...
##  $ year             : Factor w/ 3 levels "2007","2008",..: 1 1 1 1 1 1 1 1 1 1 ...

Example 1 Following the guidelines above, test the following hypotheses:

  1. The variation in body mass is equal among the three different islands
  2. The variation in body mass is equal among the three different years
  3. The variation in body mass is equal among the three different islands and three different years (interaction)

Think about which type of model you need to use to test these hypotheses. Report the results from your assessments of the linear model assumptions. Use the multiple comparisons results in a final figure to demonstrate how the results support or refute the above hypotheses.

Example 2 Following the guidelines above, test the following hypotheses:

  1. The change in flipper length results in an increase in body mass
  2. The variation in body mass is equal among the three different species
  3. The change in flipper length results in the same increase in body mass among the three different species (interaction)

Think about which type of model you need to use to test these hypotheses. Report the results from your assessments of the linear model assumptions. Use the multiple comparisons results in a final figure to demonstrate how the results support or refute the above hypotheses.


  1. Statistical consultant,